10th World Congress in Probability and Statistics

Invited Session (live Q&A at Track 1, 11:30AM KST)

Invited 05

Recent Advances in Shape Constrained Inference (Organizer: Bodhisattva Sen)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 20 Tue, 10:30 PM — 11:00 PM EDT

Global rates of convergence in mixture density estimation

Arlene Kyoung Hee Kim (Korea University)

In this talk, we consider estimating a monotone decreasing density f_0 represented by a scale mixture of uniform densities. We first derive a general bound on the Hellinger accuracy of the MLE over convex classes. Using this bound together with an entropy calculation, we give an alternative proof of the convergence rate of the MLE for d = 1. We then consider a possible multidimensional extension: for d ≥ 2, we prove that the rate is as conjectured by Pavlides and Wellner, under the assumption that the density is bounded above and below and supported on a compact region. We are exploring strategies for weakening these assumptions.
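For intuition in the d = 1 case: the MLE of a monotone decreasing density on [0, ∞) is the classical Grenander estimator, the left derivative of the least concave majorant (LCM) of the empirical CDF. A minimal NumPy sketch (illustrative only, not the speaker's code):

```python
import numpy as np

def grenander(x):
    """Grenander estimator: MLE of a decreasing density on [0, inf).

    Returns (knots, heights): the estimator is piecewise constant with
    value heights[k] on (knots[k], knots[k+1]].  Computed as the left
    derivative of the least concave majorant (LCM) of the empirical CDF.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    # Points of the empirical CDF, prepending the origin.
    xs = np.concatenate(([0.0], x))
    ys = np.arange(n + 1) / n
    # Build the upper concave hull (the LCM) with a monotone-chain scan:
    # pop the last hull point while it lies on or below the new chord.
    hull = [0]
    for i in range(1, n + 1):
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            # slope(a, b) <= slope(a, i)  <=>  b lies below the chord a-i
            if (ys[b] - ys[a]) * (xs[i] - xs[a]) <= (ys[i] - ys[a]) * (xs[b] - xs[a]):
                hull.pop()
            else:
                break
        hull.append(i)
    knots = xs[hull]
    heights = np.diff(ys[hull]) / np.diff(knots)  # decreasing slopes of the LCM
    return knots, heights
```

The returned step heights are automatically non-increasing and integrate to one, since they are the slopes of a concave function majorizing the empirical CDF.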

Convex regression in multidimensions

Adityanand Guntuboyina (University of California Berkeley)

I will present results on the rates of convergence of the least squares estimator for multidimensional convex regression with polytopal domains. Our results imply that the least squares estimator is minimax suboptimal when the dimension exceeds 5.

This is joint work with Gil Kur, Frank Fuchang Gao and Bodhisattva Sen.

Multiple isotonic regression: limit distribution theory and confidence intervals

Qiyang Han (Rutgers University)

In the first part of the talk, we study limit distributions for the tuning-free max-min block estimators in multiple isotonic regression, under both fixed lattice design and random design settings. We show that at a fixed interior point of the design space, the estimation error of the max-min block estimator converges in distribution to a non-Gaussian limit, at a rate depending on the number of vanishing derivatives and on an effective dimension and sample size that drive the asymptotic theory. The limiting distribution can be viewed as a generalization of the well-known Chernoff distribution from univariate problems. The convergence rate is optimal in a local asymptotic minimax sense.

In the second part of the talk, we demonstrate how to use this limit distribution to construct tuning-free pointwise nonparametric confidence intervals in this model, despite the presence of an infinite-dimensional nuisance parameter in the limit distribution that involves multiple unknown partial derivatives of the true regression function. We show that this difficult nuisance parameter can be eliminated effectively by exploiting information beyond point estimates in the block max-min and min-max estimators through random weighting. Notably, the construction of the confidence intervals, which is new even in the univariate setting, requires no more effort than performing isotonic regression once using the block max-min and min-max estimators, and can easily be adapted to other common monotone models.

This talk is based on joint work with Hang Deng and Cun-Hui Zhang.
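As a point of reference for the block estimators discussed above: in d = 1 the max-min estimator reduces to a simple formula that coincides with the isotonic least squares fit. A brute-force sketch of that univariate special case (for illustration only):

```python
import numpy as np

def max_min_isotonic(y):
    """Tuning-free max-min block estimator in one dimension.

    fhat[i] = max over j <= i of ( min over k >= i of mean(y[j..k]) ).
    In d = 1 this coincides with the isotonic least squares estimator;
    the estimators in the talk are its multivariate (block) analogues.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    csum = np.concatenate(([0.0], np.cumsum(y)))  # prefix sums for block means
    fhat = np.empty(n)
    for i in range(n):
        best = -np.inf
        for j in range(i + 1):
            # means of y[j..k] for every k >= i, via prefix sums
            ks = np.arange(i, n)
            means = (csum[ks + 1] - csum[j]) / (ks + 1 - j)
            best = max(best, means.min())
        fhat[i] = best
    return fhat
```

For example, `max_min_isotonic([3.0, 1.0, 2.0])` pools all three observations and returns the constant fit `[2.0, 2.0, 2.0]`; the output is always monotone nondecreasing.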

Q&A for Invited Session 05

This talk does not have an abstract.

Session Chair

Bodhisattva Sen (Columbia University)

Invited 06

Optimization in Statistical Learning (Organizer: Garvesh Raskutti)

Conference
11:30 AM — 12:00 PM KST
Local
Jul 20 Tue, 10:30 PM — 11:00 PM EDT

Statistical inference on latent network growth processes using the PAPER model

Min Xu (Rutgers University)

We introduce the PAPER (Preferential Attachment Plus Erdős–Rényi) model for random networks, in which a random network G is the union of a preferential attachment (PA) tree T and additional Erdős–Rényi (ER) random edges. The PA tree component captures the fact that real-world networks often have an underlying growth/recruitment process in which vertices and edges are added sequentially, while the ER component can be regarded as random noise. Given only a single snapshot of the final network G, we study the problem of constructing confidence sets for the root node of the unobserved growth process, which may be patient zero in a disease infection network or the source of fake news in a social media network. We propose an inference algorithm based on Gibbs sampling that scales to networks with millions of nodes, and we provide a theoretical analysis showing that the expected size of the confidence set is small so long as the noise level of the ER edges is not too large. We also propose variants of the model in which multiple growth processes occur simultaneously, reflecting the growth of multiple communities, and we use these models to derive a new approach to community detection.
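A toy generative sketch of the PAPER construction described above (the per-pair noise probability theta/n is an assumed parameterization for illustration; the talk's exact formulation may differ):

```python
import random

def sample_paper_graph(n, theta, seed=0):
    """Toy sampler for a PAPER-style graph: a preferential attachment (PA)
    tree T on n nodes, plus Erdos-Renyi (ER) noise edges added independently
    with probability theta/n per pair.  The observed network G is the union
    of the two edge sets; only G (not the tree) would be seen in practice.
    """
    rng = random.Random(seed)
    tree_edges = []
    endpoints = [0]  # each node appears once per unit of degree
    for t in range(1, n):
        parent = rng.choice(endpoints)  # degree-proportional attachment
        tree_edges.append((parent, t))
        endpoints.extend([parent, t])
    tree_set = set(tree_edges)
    p = theta / n
    noise_edges = [(i, j) for i in range(n) for j in range(i + 1, n)
                   if (i, j) not in tree_set and rng.random() < p]
    return tree_edges, noise_edges
```

The root-inference problem is then: given only the union of the two edge sets, construct a confidence set of nodes that contains node 0 with prescribed probability.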

Adversarial classification, optimal transport, and geometric flows

Nicolas Garcia Trillos (University of Wisconsin-Madison)

The purpose of this talk is to provide an explicit link between the three topics in the talk's title, and to introduce a new, more dynamic and geometric perspective on robust classification problems. For concreteness, we discuss a version of adversarial classification in which an adversary is empowered to corrupt data inputs up to some distance \epsilon. We first describe necessary conditions satisfied by the optimal classifier subject to such an adversary. Then, using these necessary conditions, we derive a geometric evolution equation that can be used to track the change in classification boundaries as \epsilon varies. This evolution equation may be described as an uncoupled system of differential equations in one dimension, or as a mean-curvature-type equation in higher dimensions. In one dimension, we rigorously prove that one can use the initial value problem starting from \epsilon = 0, which is simply the Bayes classifier, to solve for the global minimizer of the adversarial problem. Global optimality is certified using a duality principle between the original adversarial problem and an optimal transport problem. Several open questions and directions for further research will be discussed.
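A concrete one-dimensional illustration of the setup (a toy example constructed here, not taken from the talk): for a threshold classifier on a balanced Gaussian mixture, the adversarial 0-1 risk has a closed form, and by symmetry the Bayes threshold t = 0 stays optimal for every budget \epsilon < 1:

```python
from math import erf, sqrt

def norm_cdf(x, mu=0.0):
    """CDF of N(mu, 1)."""
    return 0.5 * (1.0 + erf((x - mu) / sqrt(2.0)))

def adversarial_risk(t, eps):
    """Adversarial 0-1 risk of the threshold rule 1{x >= t} on the
    balanced mixture N(-1, 1) vs N(+1, 1), when an adversary may shift
    each input by up to eps toward the decision boundary.

    A class-0 point is flipped iff x >= t - eps; a class-1 point iff
    x < t + eps, so the eps-band around the boundary is always lost.
    """
    err0 = 1.0 - norm_cdf(t - eps, mu=-1.0)  # P(X >= t - eps | class 0)
    err1 = norm_cdf(t + eps, mu=+1.0)        # P(X <  t + eps | class 1)
    return 0.5 * (err0 + err1)
```

At eps = 0 this reduces to the usual Bayes risk of the threshold rule; as eps grows the risk increases, tracking how the optimal boundary (here fixed at 0 by symmetry) degrades, which is the kind of eps-evolution the talk studies in general.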

Capturing network effect via fused lasso penalty with application on shared-bike data

Yunjin Choi (University of Seoul)

Given a dataset with network structure, a common research interest is to model nodal features while accounting for network effects. In this study, we investigate shared-bike data from Seoul under a spatial network framework, focusing on the rental counts of each station. Our proposed method models rental counts via a generalized linear model with regularization. The regularization uses a fused lasso penalty devised to capture the network effect. In this model, parameters are posed in a station-specific manner, and the fused lasso penalty terms are applied to parameters associated with locationally nearby stations. This encourages the parameters of neighboring stations to take the same value, accounting for the underlying network effect in a data-adaptive way. The proposed method shows promising results.
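A minimal sketch of the kind of objective described (assumed form: a Poisson log-linear model for counts with an l1 fused penalty over edges of the station network; the actual model and fitting algorithm in the talk may differ):

```python
import numpy as np

def fused_poisson_objective(theta, y, edges, lam):
    """Penalized Poisson log-linear objective with a fused lasso penalty:
    sum_i [exp(theta_i) - y_i * theta_i] + lam * sum_{(i,j) in E} |theta_i - theta_j|.
    Neighboring stations (edges E) are encouraged to share a parameter value.
    """
    i, j = edges[:, 0], edges[:, 1]
    nll = np.sum(np.exp(theta) - y * theta)
    pen = lam * np.sum(np.abs(theta[i] - theta[j]))
    return nll + pen

def fit_subgradient(y, edges, lam, steps=2000, lr=0.01):
    """Crude subgradient descent on the objective; keeps the best iterate."""
    theta = np.log(np.maximum(y, 1.0))  # start at the per-station MLE
    best, best_obj = theta.copy(), fused_poisson_objective(theta, y, edges, lam)
    i, j = edges[:, 0], edges[:, 1]
    for t in range(steps):
        g = np.exp(theta) - y                 # gradient of the Poisson loss
        s = np.sign(theta[i] - theta[j])      # subgradient of the fused penalty
        np.add.at(g, i, lam * s)
        np.add.at(g, j, -lam * s)
        theta -= lr / np.sqrt(t + 1.0) * g
        obj = fused_poisson_objective(theta, y, edges, lam)
        if obj < best_obj:
            best, best_obj = theta.copy(), obj
    return best
```

On a toy path of four stations with counts [10, 11, 30, 31], the penalty pulls the parameters of adjacent stations together, reducing the total variation of the fit relative to the unpenalized per-station estimates.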

Q&A for Invited Session 06

This talk does not have an abstract.

Session Chair

Garvesh Raskutti (University of Wisconsin-Madison)
